{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Manipulating Memory at Runtime\n",
    "\n",
    "In this notebook, we cover how to use the `Memory` class to build an agentic workflow with dynamic memory.\n",
    "\n",
    "Specifically, we will build a workflow where a user can upload a file, and pin that to the context of the LLM (i.e. like the file context in Cursor).\n",
    "\n",
    "By default, as the short-term memory fills up and is flushed, it will be passed to memory blocks for processing as needed (extracting facts, indexing for retrieval, or for static blocks, ignoring it).\n",
    "\n",
    "With this notebook, the intent is to show how memory can be managed and manipulated at runtime, beyond the already existing functionality described above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "For our workflow, we will use OpenAI as our LLM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index-core llama-index-llms-openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Workflow Setup\n",
    "\n",
    "Our workflow will be fairly straightfoward. There will be two main entry points\n",
    "\n",
    "1. Adding/Removing files from memory \n",
    "2. Chatting with the LLM\n",
    "\n",
    "Using the `Memory` class, we can introduce memory blocks that hold our static context."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "from typing import List, Literal, Optional\n",
    "from pydantic import Field\n",
    "from llama_index.core.memory import Memory, StaticMemoryBlock\n",
    "from llama_index.core.llms import LLM, ChatMessage, TextBlock, ImageBlock\n",
    "from llama_index.core.workflow import (\n",
    "    Context,\n",
    "    Event,\n",
    "    StartEvent,\n",
    "    StopEvent,\n",
    "    Workflow,\n",
    "    step,\n",
    ")\n",
    "\n",
    "\n",
    "class InitEvent(StartEvent):\n",
    "    user_msg: str\n",
    "    new_file_paths: List[str] = Field(default_factory=list)\n",
    "    removed_file_paths: List[str] = Field(default_factory=list)\n",
    "\n",
    "\n",
    "class ContextUpdateEvent(Event):\n",
    "    new_file_paths: List[str] = Field(default_factory=list)\n",
    "    removed_file_paths: List[str] = Field(default_factory=list)\n",
    "\n",
    "\n",
    "class ChatEvent(Event):\n",
    "    pass\n",
    "\n",
    "\n",
    "class ResponseEvent(StopEvent):\n",
    "    response: str\n",
    "\n",
    "\n",
    "class ContextualLLMChat(Workflow):\n",
    "    def __init__(self, memory: Memory, llm: LLM, **workflow_kwargs):\n",
    "        super().__init__(**workflow_kwargs)\n",
    "        self._memory = memory\n",
    "        self._llm = llm\n",
    "\n",
    "    def _path_to_block_name(self, file_path: str) -> str:\n",
    "        return re.sub(r\"[^\\w-]\", \"_\", file_path)\n",
    "\n",
    "    @step\n",
    "    async def init(self, ev: InitEvent) -> ContextUpdateEvent | ChatEvent:\n",
    "        # Manage memory\n",
    "        await self._memory.aput(ChatMessage(role=\"user\", content=ev.user_msg))\n",
    "\n",
    "        # Forward to chat or context update\n",
    "        if ev.new_file_paths or ev.removed_file_paths:\n",
    "            return ContextUpdateEvent(\n",
    "                new_file_paths=ev.new_file_paths,\n",
    "                removed_file_paths=ev.removed_file_paths,\n",
    "            )\n",
    "        else:\n",
    "            return ChatEvent()\n",
    "\n",
    "    @step\n",
    "    async def update_memory_context(self, ev: ContextUpdateEvent) -> ChatEvent:\n",
    "        current_blocks = self._memory.memory_blocks\n",
    "        current_block_names = [block.name for block in current_blocks]\n",
    "\n",
    "        for new_file_path in ev.new_file_paths:\n",
    "            if new_file_path not in current_block_names:\n",
    "                if new_file_path.endswith((\".png\", \".jpg\", \".jpeg\")):\n",
    "                    self._memory.memory_blocks.append(\n",
    "                        StaticMemoryBlock(\n",
    "                            name=self._path_to_block_name(new_file_path),\n",
    "                            static_content=[ImageBlock(path=new_file_path)],\n",
    "                        )\n",
    "                    )\n",
    "                elif new_file_path.endswith((\".txt\", \".md\", \".py\", \".ipynb\")):\n",
    "                    with open(new_file_path, \"r\") as f:\n",
    "                        self._memory.memory_blocks.append(\n",
    "                            StaticMemoryBlock(\n",
    "                                name=self._path_to_block_name(new_file_path),\n",
    "                                static_content=f.read(),\n",
    "                            )\n",
    "                        )\n",
    "                else:\n",
    "                    raise ValueError(f\"Unsupported file: {new_file_path}\")\n",
    "        for removed_file_path in ev.removed_file_paths:\n",
    "            # Remove the block from memory\n",
    "            named_block = self._path_to_block_name(removed_file_path)\n",
    "            self._memory.memory_blocks = [\n",
    "                block\n",
    "                for block in self._memory.memory_blocks\n",
    "                if block.name != named_block\n",
    "            ]\n",
    "\n",
    "        return ChatEvent()\n",
    "\n",
    "    @step\n",
    "    async def chat(self, ev: ChatEvent) -> ResponseEvent:\n",
    "        chat_history = await self._memory.aget()\n",
    "        response = await self._llm.achat(chat_history)\n",
    "        return ResponseEvent(response=response.message.content)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the Workflow\n",
    "\n",
    "Now that we have our chat workflow defined, we can try it out! You can use any file, but for this example, we will use a few dummy files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://mediaproxy.tvtropes.org/width/1200/https://static.tvtropes.org/pmwiki/pub/images/shrek_cover.png -O ./image.png\n",
    "!wget https://raw.githubusercontent.com/run-llama/llama_index/refs/heads/main/llama-index-core/llama_index/core/memory/memory.py -O ./memory.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.memory import Memory\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "llm = OpenAI(model=\"gpt-4.1-nano\")\n",
    "\n",
    "memory = Memory.from_defaults(\n",
    "    session_id=\"my_session\",\n",
    "    token_limit=60000,\n",
    "    chat_history_token_ratio=0.7,\n",
    "    token_flush_size=5000,\n",
    "    insert_method=\"user\",\n",
    ")\n",
    "\n",
    "workflow = ContextualLLMChat(\n",
    "    memory=memory,\n",
    "    llm=llm,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can simulate a user adding a file to memory, and then chatting with the LLM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running step init\n",
      "Step init produced event ContextUpdateEvent\n",
      "Running step update_memory_context\n",
      "Step update_memory_context produced event ChatEvent\n",
      "Running step chat\n",
      "Step chat produced event ResponseEvent\n",
      "--------------------------------\n",
      "This file contains the implementation of a sophisticated, asynchronous memory management system designed for conversational AI or chat-based applications. Its main components and functionalities include:\n",
      "\n",
      "1. **Memory Block Abstraction (`BaseMemoryBlock`)**:\n",
      "   - An abstract base class defining the interface for memory blocks.\n",
      "   - Subclasses must implement methods to asynchronously get (`aget`) and put (`aput`) content.\n",
      "   - Optional truncation (`atruncate`) to manage size.\n",
      "\n",
      "2. **Memory Management Class (`Memory`)**:\n",
      "   - Orchestrates overall memory handling, including:\n",
      "     - Maintaining a FIFO message queue with token size limits.\n",
      "     - Managing multiple memory blocks with different priorities.\n",
      "     - Handling insertion of memory content into chat history.\n",
      "     - Truncating memory blocks when token limits are exceeded.\n",
      "     - Formatting memory blocks into templates for inclusion in chat messages.\n",
      "     - Managing the lifecycle of chat messages via an SQL store (`SQLAlchemyChatStore`).\n",
      "\n",
      "3. **Key Functionalities**:\n",
      "   - **Token Estimation**: Methods to estimate token counts for messages, blocks, images, and audio.\n",
      "   - **Queue Management (`_manage_queue`)**: Ensures the message queue stays within token limits by archiving and moving old messages into memory blocks, maintaining conversation integrity.\n",
      "   - **Memory Retrieval (`aget`)**: Fetches chat history combined with memory block content, formatted via templates, ready for use in conversations.\n",
      "   - **Memory Insertion**: Inserts memory content into chat history either as system messages or appended to user messages, based on configuration.\n",
      "   - **Asynchronous Operations**: Many methods are async, allowing non-blocking I/O with the chat store and memory blocks.\n",
      "   - **Synchronous Wrappers**: Synchronous methods wrap async calls for convenience.\n",
      "\n",
      "4. **Supporting Functions and Defaults**:\n",
      "   - Unique key generation for chat sessions.\n",
      "   - Default memory block templates.\n",
      "   - Validation and configuration logic for memory parameters.\n",
      "\n",
      "Overall, this code provides a flexible, priority-based, token-aware memory system that integrates with a chat history stored in a database, enabling long-term memory, context management, and conversation continuity in AI chat systems.\n"
     ]
    }
   ],
   "source": [
    "response = await workflow.run(\n",
    "    user_msg=\"What does this file contain?\",\n",
    "    new_file_paths=[\"./memory.py\"],\n",
    ")\n",
    "\n",
    "print(\"--------------------------------\")\n",
    "print(response.response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! Now, we can simulate a user removing that file, and adding a new one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running step init\n",
      "Step init produced event ContextUpdateEvent\n",
      "Running step update_memory_context\n",
      "Step update_memory_context produced event ChatEvent\n",
      "Running step chat\n",
      "Step chat produced event ResponseEvent\n",
      "--------------------------------\n",
      "The file contains an image of the animated movie poster for \"Shrek.\" It features various characters from the film, including Shrek, Fiona, Donkey, Puss in Boots, and others, set against a bright, colorful background.\n"
     ]
    }
   ],
   "source": [
    "response = await workflow.run(\n",
    "    user_msg=\"What does this next file contain?\",\n",
    "    new_file_paths=[\"./image.png\"],\n",
    "    removed_file_paths=[\"./memory.py\"],\n",
    ")\n",
    "\n",
    "print(\"--------------------------------\")\n",
    "print(response.response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It works! Now, you've learned how to manage memory in a custom workflow. Beyond just letting short-term memory flush into memory blocks, you can manually manipulate the memory blocks at runtime as well."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
